Generalized Character-Level Spelling Error Correction

Authors

  • Noura Farra
  • Nadi Tomeh
  • Alla Rozovskaya
  • Nizar Habash
Abstract

We present a generalized discriminative model for spelling error correction which targets character-level transformations. While operating at the character level, the model makes use of word-level and contextual information. In contrast to previous work, the proposed approach learns to correct a variety of error types without the guidance of manually selected constraints or language-specific features. We apply the model to correct errors in Egyptian Arabic dialect text, achieving a 65% reduction in word error rate over the input baseline, and improving over the earlier state-of-the-art system.
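To make the two notions in the abstract concrete, the following is a minimal sketch of (a) the character-level edits that separate a misspelled word from its correction and (b) how a relative reduction in word error rate is computed. It is not the authors' model: the function names `char_edit_ops` and `relative_wer_reduction` are hypothetical, the alignment uses Python's standard `difflib` rather than anything described in the paper, and the error rates in the usage lines are invented numbers chosen only to illustrate the arithmetic.

```python
from difflib import SequenceMatcher


def char_edit_ops(source: str, target: str):
    """List the character-level edits that turn `source` into `target`.

    Illustrative only: the alignment comes from difflib, not from the paper.
    Each item is (operation, source_chars, target_chars).
    """
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only insert / delete / replace spans
            ops.append((tag, source[i1:i2], target[j1:j2]))
    return ops


def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Fraction of the baseline word error rate removed by the system."""
    return (baseline_wer - system_wer) / baseline_wer


# A missing letter shows up as a single character insertion.
print(char_edit_ops("speling", "spelling"))

# Hypothetical rates (not taken from the paper): 0.20 before, 0.07 after.
print(relative_wer_reduction(0.20, 0.07))
```

A character-level corrector of the kind the abstract describes would have to predict edits like these for each input character, whereas the reported figure is measured over whole words.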


Related resources

A Comparison of Four Character-Level String-to-String Translation Models for (OCR) Spelling Error Correction

We consider the isolated spelling error correction problem as a specific subproblem of the more general string-to-string translation problem. In this context, we investigate four general string-to-string transformation models that have been suggested in recent years and apply them within the spelling error correction paradigm. In particular, we investigate how a simple ‘k-best decoding plus dict...


Chinese Word Spelling Correction Based on N-gram Ranked Inverted Index List

Spelling correction can help individuals enter text data into a machine using written language, so that they can obtain relevant information efficiently and effectively. Relevant applications such as web search, writing systems, recommender systems, document mining, and typo checking before printing are closely related to spelling correction. Individuals can input text, keyword, sentence how to int...


Typographical and Orthographical Spelling Error Correction

This paper focuses on selection techniques for best correction of misspelt words at the lexical level. Spelling errors are introduced by either cognitive or typographical mistakes. A robust spelling correction algorithm is needed to cover both cognitive and typographical errors. For the most effective spelling correction system, various strategies are considered in this paper: ranking heuristic...


A New Approach for Automatic Chinese Spelling Correction

This article presents a new approach for automatic Chinese spelling error detection and correction. Existing Chinese spelling checking systems have two problems: (1) low precision rate, and (2) lack of correction capability. The proposed Chinese spelling correction method is composed of two mechanisms: (1) composite confusing character substitution, and (2) advanced word class bigram language mo...


Chinese Spelling Error Detection and Correction Based on Language Model, Pronunciation, and Shape

Spelling check is an important preprocessing task when dealing with user-generated texts such as tweets and product comments. Compared with some Western languages such as English, Chinese spelling check is more complex because there is no word delimiter in Chinese written texts and misspelled characters can only be determined at the word level. Our system works as follows. First, we use character-l...




Publication date: 2014